Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Upgrade CI #14635

Merged
merged 16 commits into from
May 4, 2023
Merged

[CI] Upgrade CI #14635

merged 16 commits into from
May 4, 2023

Conversation

yongwww
Copy link
Member

@yongwww yongwww commented Apr 16, 2023

Component Current Version Upgraded Version
Docker of ci_cpu, gpu, arm, etc. 20.04 22.04
Docker of ci_i386 18.04 20.04 (No 22.04 so far)
CMake 3.18 3.20.0
LLVM - 13, 14, 15, 16
ROCM 4.3 5.3 (4.3 has unmet packages in upgraded container)
CUDA 11.7 11.8
Python 3.7 3.8
Pylint 2.4.4 2.17.2
Torch - 2.0
torchvision - 0.15.1
libtorch - 2.0
Golang 1.12 1.18
emsdk 2.0.7 3.1.30
Verilator 4.104 5.002
Numpy 1.21 1.23
DGL 0.7.2 1.0.0
clang-format 10 15

@tvm-bot
Copy link
Collaborator

tvm-bot commented Apr 16, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@tqchen
Copy link
Member

tqchen commented Apr 16, 2023

cc @masahi

Copy link
Contributor

@leandron leandron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yongwww, thanks for the PR. I see this is marked as work in progress.

Can I just ask the we validate the images in a full CI round before merging this PR, that not only they build the images correctly, but also that tests pass with the resultant images, as you're doing with #14651. These are a lot of changes for one PR, but I appreciate that moving versions is hard.

I'm marking this as "request changes" so that we don't accidentally merge this based on the local CI results.

@yongwww
Copy link
Member Author

yongwww commented Apr 18, 2023

Thanks, @leandron ! I agree that this pull request should be held off until #14651 turns green. I've marked it as a work-in-progress and will revisit it

@yongwww yongwww changed the title [WIP][CI] Upgrade CI [CI] Upgrade CI Apr 24, 2023
@Aleksei-grovety
Copy link
Contributor

Hi @yongwww, thanks for the PR. I saw you tried tensorflow==2.12.0 but it was rolled back. Have you considered tensorflow==2.10? For example, Vela 3.7.0 (recently updated) supports TensorFlow 2.10.

@yongwww
Copy link
Member Author

yongwww commented Apr 24, 2023

@Aleksei-grovety great observation! I rolled back the TensorFlow upgrade because the linking process failed for the TFLite runtime during the TVM build (lots of errors like "multiple definition ...", and "undefined reference to..."). This issues need specific update in the tflite runtime build, which I haven't had time to address. In fact, I think we should consider using TFLite runtime Python package (https://pypi.org/project/tflite-runtime/) provided by Google.

@masahi
Copy link
Member

masahi commented Apr 24, 2023

@yongwww Feel free to drop PT 2.0 update from this PR. I'm aware that some PT frontend tests are broken with 2.0. I can continue my PR to fix them, once we get Python 3.10 from here.

@yongwww
Copy link
Member Author

yongwww commented Apr 25, 2023

@masahi I have fixed some of the failed pt tests in #14651, and skip those tests if there is new unsupported op. We will need PT 2.0 in unity branch, it would be good to have it in this upgrade

@yongwww
Copy link
Member Author

yongwww commented Apr 27, 2023

The PR is ready for review, pls take a look. cc: @masahi @driazati @leandron @tqchen

Copy link
Member

@driazati driazati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks @yongwww, the changes LGTM if the build from #14651 passes

@yongwww
Copy link
Member Author

yongwww commented Apr 28, 2023

CI #14651 is green

@tqchen
Copy link
Member

tqchen commented Apr 28, 2023

Thank you @yongwww for heroic effort!

@psrivas2
Copy link
Contributor

psrivas2 commented May 2, 2023

cc @leandron can you take another look at this PR? I think this is ready to be merged.

@tqchen
Copy link
Member

tqchen commented May 3, 2023

ping @leandron

@yongwww
Copy link
Member Author

yongwww commented May 3, 2023

@tvm-bot rerun

@driazati driazati dismissed leandron’s stale review May 3, 2023 04:22

cautiously dismissing since the comments have been addressed and this PR has been in flight for a while, we can address any concerns in follow ups

Copy link
Contributor

@leandron leandron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to merge this once #14651 CI is green. At the moment it is running, as the GitHub actions needed to be approved, which I just did.

I marked a few action I feel we need to take as follow-up PRs, can you just open those as issues so that we can pick them up asap. There no need to block this PR because the actual changes, but can you open the issues, please?

@@ -54,4 +48,4 @@ cd "$tmpdir"
git clone --branch "$repo_revision" "$repo_url" "$repo_dir"

cd "$repo_dir"/driver
scons install_prefix="$install_path" install
scons werror=False install_prefix="$install_path" install
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind opening an issue to report what is the error that makes this script now need werror=False?

I understand it will have some output, but can you add that output in the issue, so that somebody can pick that up to investigate?

Copy link
Member Author

@yongwww yongwww May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got error like "error: loop variable 'pair' creates a copy from type 'const std::pair<const unsigned int, std::pair<long unsigned int, long unsigned int> >' [-Werror=range-loop-construct] for (const auto pair : passAgentIdRanges)", it should be a warning, so I added werror=False in scons install for that, I think the ideal solution is to modify the code in https://github.com/Arm-software/ethos-n-driver-stack to fix this warning. More details about the error info. please refer to the log https://ci.tlcpack.ai/blue/rest/organizations/jenkins/pipelines/tvm-docker/branches/PR-14635/runs/5/nodes/73/steps/129/log/?start=0 Probably we can create an issue in the arm repo?

docker/Dockerfile.ci_arm Show resolved Hide resolved
@tqchen
Copy link
Member

tqchen commented May 3, 2023

Going to merge this first in 24 hours so we can proceed with the updates on the other side

@tqchen tqchen merged commit 4d37a0a into apache:main May 4, 2023
@tqchen
Copy link
Member

tqchen commented May 4, 2023

Thank you @yongwww for heroic effort

@yongwww yongwww deleted the ci_upgrade branch May 5, 2023 03:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants